15. Full-Text Searching in Memo Fields
While FileFlex character fields are fixed in width at the time of database
design, and can store only up to 256 characters each, memo fields have no
such restrictions.
FileFlex memo fields hold up to 32K and are allocated in 512-byte increments.
As a result, you can comfortably place descriptions, notes, reports, letters,
and other long information into FileFlex memo fields.
Easy Full-Text Search
FileFlex provides a simple, yet powerful function that allows you to find
text anywhere in a memo field. The DBFindMemo function searches the specified
memo field in the current database for the string specified. This is something
of a brute-force searching mechanism. FileFlex checks every record, scanning
each field of each record for the matching search string:
DBFindMemo("NOTES","status on FileFlex") into dbResult
Building a Complex Indexed Full-Text Search
If your database is very large, you'll find that DBFindMemo will take some
time to find your record. But you can combine a number of FileFlex features
together to create a rather sophisticated fully-indexed full-text search.
Note: Unlike normal indexes, this mechanism doesn't dynamically update
the full-text index unless the index is rebuilt. Therefore it's most appropriate
for use with CD-ROM or unchanging information.
There are two phases to performing a complex indexed full-text search:
- At development time, construct the index itself
- At runtime, search the index for your words
Constructing a Full-Text Index
Before you can build your index, you need to think through the indexing
architecture. The most important database is the file where your memo field
is located. Let's call this the "memo database". When the user
enters a search word, we want to move the record pointer in the memo database
to the first matching record.
You'll need to create two additional files: a database file (let's describe
it as the "word database") containing the words to be searched
and an index into the word database file.
Define the word database with two fields: a fixed width field for each word
in the memo file, and a fixed width field containing the physical record
number of the record in the memo file containing the word. Alternatively,
the you can replace the record number with any other unique key that you
can search for...but you'll need an extra index file if you use this technique.
Finally, you'll need a standard index file that uses the fixed word field
as the indexed field and the word file as the data file. So, here's what
you've got so far:
- Memo database: The database containing your memo field. This
is where the actual information you're looking for resides;
- Word database: The database containing in one field all the words
in the memo database memo field, and a field containing a pointer to the
appropriate record in the memo database;
- Index-on-word file: An index file indexing the word field in
the word database (make sure you use the UPPER intrinsic function to reduce
ignore case).
Now you have to write a utility routine that follows the following algorithm
(remember that this is a routine that you as the developer will run, not
one used by the end-user of your software):
For every record in the memo database..
Read the contents of the memo field into a variable
For every word in the variable...
Write the word and the current memo database record
number to a new record in the word database
Make sure the index file is up-to-date with the data in the
word database.
At this point, you've got everything you need to do an incredibly fast,
indexed full-text search. Let's look at how it'll run in your application
to retrieve a value.
Searching Using a Full-Text Index
You get the word to be searched via some form of user input. You then pass
that word to a full-text seek routine that does the following:
- Selects the word database (DBSelect)
- Selects the index-on-word index (DBSelectIndex)
- Prepares a seek expression, converting to upper case (DBBuildSeekExpr)
- Does the seek, moving to the first match (DBSeek)
- Grabs the pointer to the memo database (DBGetFieldByName)
- Selects the memo database (DBSelect)
- Moves (DBGo) to the physical record containing the memo field
You can, of course, write the routine to iterate through the list of matching
words if you want to build a table of all the records that contain a given
word.
[Previous Chapter] [Table of Contents] [Next Chapter]
Copyright (c) 1996 David Gewirtz under license to Component Software Corp.
All rights reserved worldwide.